Keyword Search Result

[Keyword] neural network(855hit)

201-220hit(855hit)

  • SDChannelNets: Extremely Small and Efficient Convolutional Neural Networks

    JianNan ZHANG  JiJun ZHOU  JianFeng WU  ShengYing YANG  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2019/09/10
      Vol:
    E102-D No:12
      Page(s):
    2646-2650

    Convolutional neural networks (CNNs) have a strong ability to understand and judge images. However, the enormous number of parameters and the computational cost of CNNs have limited their application on resource-limited devices. In this letter, we used the idea of parameter sharing and dense connection to compress the parameters in the convolution kernel channel direction, thus greatly reducing the number of model parameters. On this basis, we designed Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), mainly composed of Depth-wise Separable SD-Channel-wise Convolution layers. The advantage of SDChannelNets is that the number of model parameters is greatly reduced with little or no loss of accuracy. We also introduced a hyperparameter that can effectively balance the number of parameters and the accuracy of a model. We evaluated the proposed model on two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results showed that SDChannelNets had similar accuracy to other CNNs, but with a greatly reduced number of parameters.

  • A Spectral Clustering Based Filter-Level Pruning Method for Convolutional Neural Networks

    Lianqiang LI  Jie ZHU  Ming-Ting SUN  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/09/17
      Vol:
    E102-D No:12
      Page(s):
    2624-2627

    Convolutional Neural Networks (CNNs) usually have millions or even billions of parameters, which makes them hard to deploy on mobile devices. In this work, we present a novel filter-level pruning method to alleviate this issue. More concretely, we first construct an undirected fully connected graph to represent a pre-trained CNN model. Then, we employ the spectral clustering algorithm to divide the graph into subgraphs, which is equivalent to clustering similar filters of the CNN into the same groups. After obtaining the grouping relationships among the filters, we finally keep one filter per group and retrain the pruned model. Compared with previous pruning methods that identify redundant filters heuristically, the proposed method can select the pruning candidates more reasonably and precisely. Experimental results also show that our proposed pruning method achieves significant improvements over state-of-the-art methods.

  • Dither NN: Hardware/Algorithm Co-Design for Accurate Quantized Neural Networks

    Kota ANDO  Kodai UEYOSHI  Yuka OBA  Kazutoshi HIROSE  Ryota UEMATSU  Takumi KUDO  Masayuki IKEBE  Tetsuya ASAI  Shinya TAKAMAEDA-YAMAZAKI  Masato MOTOMURA  

     
    PAPER-Computer System

      Publicized:
    2019/07/22
      Vol:
    E102-D No:12
      Page(s):
    2341-2353

    Deep neural networks (NNs) have been widely accepted for enabling various AI applications; however, the limitation of computational and memory resources is a major problem on mobile devices. A quantized NN with reduced bit precision is an effective solution that relaxes the resource requirements, but the accuracy degradation due to its numerical approximation is another problem. We propose a novel quantized NN model employing the “dithering” technique to improve the accuracy with minimal additional hardware requirements, from the viewpoint of hardware-algorithm co-design. Dithering distributes the quantization error occurring at each pixel (neuron) spatially so that the total information loss of the plane is minimized. Experiments using software-based accuracy evaluation and FPGA-based hardware resource estimation demonstrated the effectiveness and efficiency of the concept of an NN model with dithering.
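
    The idea of spreading quantization error over neighboring elements can be illustrated with a minimal one-dimensional error-diffusion sketch. This is not the paper's model: the function names, the step size, and the sample values are hypothetical, and the abstract describes spatial dithering over a plane rather than a 1-D sequence.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of `step`, independently."""
    return [step * round(v / step) for v in values]


def quantize_with_dither(values, step):
    """Error-diffusion quantization (illustrative assumption, 1-D only).

    The rounding error of each element is carried over to the next one,
    so the errors tend to cancel across the sequence instead of piling up.
    """
    out = []
    carry = 0.0
    for v in values:
        target = v + carry           # add the error left over from the previous element
        q = step * round(target / step)
        carry = target - q           # error to pass on to the next element
        out.append(q)
    return out


# Hypothetical activations: eight identical values below the rounding threshold.
activations = [0.375] * 8
plain = quantize(activations, 1.0)
dithered = quantize_with_dither(activations, 1.0)

print(sum(activations), sum(plain), sum(dithered))
```

    In this toy case, independent rounding maps every element to zero and loses the whole signal, while the error-diffused version keeps the total of the sequence unchanged.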

  • A Fast Fabric Defect Detection Framework for Multi-Layer Convolutional Neural Network Based on Histogram Back-Projection

    Guodong SUN  Zhen ZHOU  Yuan GAO  Yun XU  Liang XU  Song LIN  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2019/08/26
      Vol:
    E102-D No:12
      Page(s):
    2504-2514

    In this paper we design a fast fabric defect detection framework (Fast-DDF) based on gray histogram back-projection, which adopts an end-to-end multi-layer convolutional network model to realize defect classification. First, the back-projection image is established through the gray histogram of the fabric image, and the closing operation and an adaptive threshold segmentation method are applied to screen out impurity information and extract the defect regions. Then, the defect images segmented by the Fast-DDF are labeled, normalized, and fed into the multi-layer convolutional neural network for training. Finally, to address the difficulty of adjusting network model parameters and the long training time, strategies such as batch normalization of samples and network fine-tuning are proposed. The experimental results on the TILDA database show that our method can deal with various defect types of textile fabrics. The average detection accuracy reaches 96.12% on a database of five different defects, and detecting a single image takes only 0.72 s.
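
    Gray histogram back-projection in general can be sketched as follows. The sketch assumes the histogram is built from a reference patch and uses a hypothetical bin count and toy pixel values; the abstract does not specify these details, so this is only an illustration of the general technique, not the Fast-DDF pipeline.

```python
def gray_histogram(pixels, bins=4, max_value=256):
    """Normalized gray-level histogram of a flat list of pixel intensities."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // max_value] += 1
    total = len(pixels)
    return [count / total for count in hist]


def back_project(image, hist, bins=4, max_value=256):
    """Replace every pixel by the histogram value of its gray-level bin.

    Pixels whose gray level is rare in the reference get a value near 0,
    which makes them candidates for defect regions.
    """
    return [[hist[p * bins // max_value] for p in row] for row in image]


# Hypothetical reference patch (assumed defect-free) and a 2x2 test image.
reference = [100, 110, 120, 130]
image = [[105, 250],
         [128, 10]]

hist = gray_histogram(reference)
projection = back_project(image, hist)
print(projection)
```

    Thresholding such a projection would then separate pixels that resemble the reference from those that do not; the closing operation and adaptive threshold mentioned in the abstract are not reproduced here.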

  • Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition

    Ryo MASUMURA  Taichi ASAMI  Takanobu OBA  Sumitaka SAKAUCHI  Akinori ITO  

     
    PAPER-Speech and Hearing

      Publicized:
    2019/09/25
      Vol:
    E102-D No:12
      Page(s):
    2557-2567

    This paper demonstrates latent word recurrent neural network language models (LW-RNN-LMs) for enhancing automatic speech recognition (ASR). LW-RNN-LMs are constructed so as to combine the advantages of both recurrent neural network language models (RNN-LMs) and latent word language models (LW-LMs). The RNN-LMs can capture long-range context information and offer strong performance, and the LW-LMs are robust for out-of-domain tasks based on the latent word space modeling. However, the RNN-LMs cannot explicitly capture hidden relationships behind observed words since a concept of a latent variable space is not present. In addition, the LW-LMs cannot take into account long-range relationships between latent words. Our idea is to combine RNN-LM and LW-LM so as to compensate for their individual disadvantages. The LW-RNN-LMs can support both latent variable space modeling, as LW-LMs do, and long-range relationship modeling, as RNN-LMs do, at the same time. From the viewpoint of RNN-LMs, LW-RNN-LM can be considered as a soft class RNN-LM with a vast latent variable space. In contrast, from the viewpoint of LW-LMs, LW-RNN-LM can be considered as an LW-LM that uses the RNN structure for latent variable modeling instead of an n-gram structure. This paper also details a parameter inference method and two kinds of implementation methods, an n-gram approximation and a Viterbi approximation, for introducing the LW-LM to ASR. Our experiments show the effectiveness of LW-RNN-LMs on a perplexity evaluation for the Penn Treebank corpus and an ASR evaluation for Japanese spontaneous speech tasks.

  • Generating Accurate Candidate Windows by Effective Receptive Field

    Baojun ZHAO  Boya ZHAO  Linbo TANG  Baoxian WANG  

     
    LETTER-Image

      Vol:
    E102-A No:12
      Page(s):
    1925-1927

    With the introduction of convolutional neural networks into the object detection field, many computer vision tasks have achieved favorable results. To adapt to targets of various scales, deep feature pyramids are widely used, since traditional object detection methods detect different objects in a Gaussian image pyramid. However, due to the mismatch between the anchors and the feature distributions of targets, accurate detection of targets with various scales is still a challenge. Considering the differences between the theoretical receptive field and the effective receptive field, we propose a novel anchor generation method, which takes the effective receptive field as the standard. The proposed method is evaluated on the PASCAL VOC dataset and shows favorable results.

  • Detecting Surface Defects of Wind Tubine Blades Using an Alexnet Deep Learning Algorithm Open Access

    Xiao-Yi ZHAO  Chao-Yi DONG  Peng ZHOU  Mei-Jia ZHU  Jing-Wen REN  Xiao-Yan CHEN  

     
    PAPER-Machine Learning

      Vol:
    E102-A No:12
      Page(s):
    1817-1824

    The paper employed Alexnet, a deep learning framework, to automatically diagnose damage on wind power generator blade surfaces. The original images of the blade surfaces were captured by the machine vision system of a 4-rotor UAV (unmanned aerial vehicle). First, an 8-layer Alexnet, comprising 21 functional sub-layers in total, was constructed and parameterized. Second, the Alexnet was trained with 10000 images and then was tested by 6-turn 350 images. Finally, the network test statistics show that the average accuracy of damage diagnosis by Alexnet is about 99.001%. We also trained and tested a traditional BP (Back Propagation) neural network, which has a 20-neuron input layer, a 5-neuron hidden layer, and a 1-neuron output layer, with the same image data. The average accuracy of damage diagnosis of the BP neural network is 19.424% lower than that of Alexnet. These results show that it is feasible to apply UAV image acquisition and a deep learning classifier to automatically diagnose damage to wind turbine blades in service.

  • Discriminative Convolutional Neural Network for Image Quality Assessment with Fixed Convolution Filters

    Motohiro TAKAGI  Akito SAKURAI  Masafumi HAGIWARA  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/08/09
      Vol:
    E102-D No:11
      Page(s):
    2265-2266

    Current image quality assessment (IQA) methods require the original images for evaluation. However, recently, IQA methods that use machine learning have been proposed. These methods learn the relationship between the distorted image and the image quality automatically. In this paper, we propose an IQA method based on deep learning that does not require a reference image. We show that a convolutional neural network with distortion prediction and fixed filters improves the IQA accuracy.

  • High Noise Tolerant R-Peak Detection Method Based on Deep Convolution Neural Network

    Menghan JIA  Feiteng LI  Zhijian CHEN  Xiaoyan XIANG  Xiaolang YAN  

     
    LETTER-Biological Engineering

      Publicized:
    2019/08/02
      Vol:
    E102-D No:11
      Page(s):
    2272-2275

    An R-peak detection method with a high noise tolerance is presented in this paper. This method utilizes a customized deep convolution neural network (DCNN) to extract morphological and temporal features from sliced electrocardiogram (ECG) signals. The proposed network adopts multiple parallel dilated convolution layers to analyze features from diverse fields of view. A sliding window slices the original ECG signals into segments, and then the network calculates one segment at a time and outputs every point's probability of belonging to the R-peak regions. After a binarization and a deburring operation, the occurrence time of the R-peaks can be located. Experimental results based on the MIT-BIH database show that the R-peak detection accuracies can be significantly improved under high intensity of the electrode motion artifact or muscle artifact noise, which reveals a higher performance than state-of-the-art methods.

  • Vector Quantization of High-Dimensional Speech Spectra Using Deep Neural Network

    JianFeng WU  HuiBin QIN  YongZhu HUA  LiHuan SHAO  Ji HU  ShengYing YANG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/07/02
      Vol:
    E102-D No:10
      Page(s):
    2047-2050

    This paper proposes a deep neural network (DNN) based framework to address the problem of vector quantization (VQ) for high-dimensional data. The main challenge of applying DNN to VQ is how to reduce the binary coding error of the auto-encoder when the distribution of the coding units is far from binary. To address this problem, three fine-tuning methods have been adopted: 1) adding Gaussian noise to the input of the coding layer, 2) forcing the output of the coding layer to be binary, 3) adding a non-binary penalty term to the loss function. These fine-tuning methods have been extensively evaluated on quantizing speech magnitude spectra. The results demonstrated that each of the methods is useful for improving the coding performance. When implemented for quantizing 968-dimensional speech spectra using only 18 bits, the DNN-based VQ framework achieved an averaged PESQ of about 2.09, which is far beyond the capability of conventional VQ methods.
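
    As a rough illustration of what a non-binary penalty and a forced binary output could look like, the sketch below assumes coding units in the range [0, 1] (for example, sigmoid outputs). The abstract does not give the actual penalty formula or threshold, so the form h * (1 - h), the 0.5 threshold, and the function names are all hypothetical.

```python
def non_binary_penalty(code):
    """Assumed penalty: sum of h * (1 - h) over the coding units.

    It is zero when every unit is exactly 0 or 1 and largest when the
    units sit at 0.5, so adding it to a loss pushes codes toward binary.
    """
    return sum(h * (1.0 - h) for h in code)


def binarize(code, threshold=0.5):
    """Force a real-valued code to bits (threshold is an assumption)."""
    return [1 if h >= threshold else 0 for h in code]


print(non_binary_penalty([0.0, 1.0, 1.0]))   # already binary
print(non_binary_penalty([0.5, 0.5]))        # maximally non-binary
print(binarize([0.2, 0.9, 0.5]))
```

    In a training loop, such a penalty would typically be added to the reconstruction loss with a weighting factor; that factor is a further assumption not stated in the abstract.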

  • Hardware-Based Principal Component Analysis for Hybrid Neural Network Trained by Particle Swarm Optimization on a Chip

    Tuan Linh DANG  Yukinobu HOSHINO  

     
    PAPER-Neural Networks and Bioengineering

      Vol:
    E102-A No:10
      Page(s):
    1374-1382

    This paper presents a hybrid architecture for a neural network (NN) trained by a particle swarm optimization (PSO) algorithm. The NN is implemented on the hardware side while the PSO is executed by a processor on the software side. In addition, principal component analysis (PCA) is also applied to reduce correlated information. The PCA module is implemented in hardware by the SystemVerilog programming language to increase operating speed. Experimental results showed that the proposed architecture had been successfully implemented. In addition, the hardware-based NN trained by PSO (NN-PSO) program was faster than the software-based NN trained by the PSO program. The proposed NN-PSO with PCA also obtained better recognition rates than the NN-PSO without-PCA.

  • Low-Cost Method for Recognizing Table Tennis Activity

    Se-Min LIM  Jooyoung PARK  Hyeong-Cheol OH  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/06/18
      Vol:
    E102-D No:10
      Page(s):
    2051-2054

    This study designs a low-cost portable device that functions as a coaching assistant system which can support table tennis practice. Although deep learning technology is a promising solution to realizing human activity recognition, we propose using cosine similarity in making inferences. Our experiments show that the cosine similarity based inference can be a good alternative to the deep learning based inference for the assistant system when resources are limited.
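
    Inference by cosine similarity can be sketched as nearest-template matching. The feature vectors, the activity labels, and the function names below are hypothetical; the abstract does not describe the actual features or the set of recognized activities.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def classify(sample, templates):
    """Return the label whose template is most similar to the sample."""
    return max(templates, key=lambda label: cosine_similarity(sample, templates[label]))


# Hypothetical per-activity templates built from sensor features.
templates = {
    "forehand": [1.0, 0.0, 0.2],
    "backhand": [-1.0, 0.1, 0.0],
}
label = classify([0.9, 0.1, 0.1], templates)
print(label)
```

    This kind of inference needs only a few multiplications per template and no trained network, which fits the resource-limited setting the abstract describes.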

  • LGCN: Learnable Gabor Convolution Network for Human Gender Recognition in the Wild Open Access

    Peng CHEN  Weijun LI  Linjun SUN  Xin NING  Lina YU  Liping ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/06/13
      Vol:
    E102-D No:10
      Page(s):
    2067-2071

    Human gender recognition in the wild is a challenging task due to complex face variations, such as poses, lighting, occlusions, etc. In this letter, a learnable Gabor convolutional network (LGCN), a new neural network computing framework for gender recognition, is proposed. In LGCN, a learnable Gabor filter (LGF) is introduced and combined with the convolutional neural network (CNN). Specifically, the proposed framework is constructed by replacing some first-layer convolutional kernels of a standard CNN with LGFs. Here, LGFs learn intrinsic parameters by using the standard back propagation method, so that the values of those parameters are no longer fixed by experience as in traditional methods, but can be modified automatically by self-learning. In addition, the performance of LGCN in gender recognition is further improved by applying a proposed feature combination strategy. The experimental results demonstrate that, compared to standard CNNs with identical network architecture, our approach achieves better performance on three challenging public datasets without any increase in parameter size.

  • Multi Model-Based Distillation for Sound Event Detection Open Access

    Yingwei FU  Kele XU  Haibo MI  Qiuqiang KONG  Dezhi WANG  Huaimin WANG  Tie HONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/07/08
      Vol:
    E102-D No:10
      Page(s):
    2055-2058

    Sound event detection is intended to identify the sound events in audio recordings, which has widespread applications in real life. Recently, convolutional recurrent neural network (CRNN) models have achieved state-of-the-art performance in this task due to their capabilities in learning the representative features. However, the CRNN models are highly complex, with millions of parameters to be trained, which limits their usage on mobile and embedded devices with limited computational resources. Model distillation is effective for distilling the knowledge of a complex model into a smaller one, which can be deployed on devices with limited computational power. In this letter, we propose a novel multi model-based distillation approach for sound event detection by making use of the knowledge from multiple teacher models that are complementary in detecting sound events. Extensive experimental results demonstrated that our approach achieves a compression ratio of about 50 times. In addition, better performance is obtained for the sound event detection task.

  • A Deep Learning Approach to Writer Identification Using Inertial Sensor Data of Air-Handwriting

    Yanfang DING  Yang XUE  

     
    LETTER-Pattern Recognition

      Publicized:
    2019/07/18
      Vol:
    E102-D No:10
      Page(s):
    2059-2063

    To the best of our knowledge, there have been only a few studies on air-handwriting character-level writer identification employing only acceleration and angular velocity data. In this paper, we propose a deep learning approach to writer identification using only inertial sensor data of air-handwriting. In particular, we separate the different degree-of-freedom (DoF) representations of air-handwriting to extract local dependencies and interrelationships in separate CNNs. Experiments on a public dataset achieve good average performance without any extra hand-designed feature extraction.

  • Cross-Domain Deep Feature Combination for Bird Species Classification with Audio-Visual Data

    Naranchimeg BOLD  Chao ZHANG  Takuya AKASHI  

     
    PAPER-Multimedia Pattern Processing

      Publicized:
    2019/06/27
      Vol:
    E102-D No:10
      Page(s):
    2033-2042

    In the recent decade, many state-of-the-art algorithms for image classification as well as audio classification have achieved noticeable successes with the development of deep convolutional neural networks (CNNs). However, most of the works only exploit a single type of training data. In this paper, we present a study on classifying bird species by exploiting the combination of both visual (images) and audio (sounds) data using CNN, which has been sparsely treated so far. Specifically, we propose CNN-based multimodal learning models with three types of fusion strategies (early, middle, late) to settle the issues of combining training data across domains. The advantage of our proposed method lies in the fact that we can utilize CNN not only to extract features from image and audio data (spectrogram) but also to combine the features across modalities. In the experiment, we train and evaluate the network structure on the comprehensive CUB-200-2011 standard data set, combining it with our originally collected audio data set with respect to the data species. We observe that a model which utilizes the combination of both data types outperforms models trained with only either type of data. We also show that transfer learning can significantly increase the classification performance.

  • Character-Level Convolutional Neural Network for Predicting Severity of Software Vulnerability from Vulnerability Description

    Shunta NAKAGAWA  Tatsuya NAGAI  Hideaki KANEHARA  Keisuke FURUMOTO  Makoto TAKITA  Yoshiaki SHIRAISHI  Takeshi TAKAHASHI  Masami MOHRI  Yasuhiro TAKANO  Masakatu MORII  

     
    LETTER-Cybersecurity

      Publicized:
    2019/06/21
      Vol:
    E102-D No:9
      Page(s):
    1679-1682

    System administrators and security officials of an organization need to deal with vulnerable IT assets, especially those with severe vulnerabilities, to minimize the risk of these vulnerabilities being exploited. The Common Vulnerability Scoring System (CVSS) can be used as a means to calculate the severity score of vulnerabilities, but it currently requires human operators to choose input values. A word-level Convolutional Neural Network (CNN) has been proposed to estimate the input parameters of CVSS and derive the severity score of vulnerability notes, but its accuracy needs to be improved further. In this paper, we propose a character-level CNN for estimating the severity scores. Experiments show that the proposed scheme outperforms the conventional one in terms of accuracy and how errors occur.

  • Multi-Level Attention Based BLSTM Neural Network for Biomedical Event Extraction

    Xinyu HE  Lishuang LI  Xingchen SONG  Degen HUANG  Fuji REN  

     
    PAPER-Natural Language Processing

      Publicized:
    2019/04/26
      Vol:
    E102-D No:9
      Page(s):
    1842-1850

    Biomedical event extraction is an important and challenging task in Information Extraction, which plays a key role for medicine research and disease prevention. Most of the existing event detection methods are based on shallow machine learning methods which mainly rely on domain knowledge and elaborately designed features. Another challenge is that some crucial information as well as the interactions among words or arguments may be ignored since most works treat words and sentences equally. Therefore, we employ a Bidirectional Long Short Term Memory (BLSTM) neural network for event extraction, which can skip handcrafted complex feature extraction. Furthermore, we propose a multi-level attention mechanism, including word level attention which determines the importance of words in a sentence, and the sentence level attention which determines the importance of relevant arguments. Finally, we train dependency word embeddings and add sentence vectors to enrich semantic information. The experimental results show that our model achieves an F-score of 59.61% on the commonly used dataset (MLEE) of biomedical event extraction, which outperforms other state-of-the-art methods.

  • A New Method for Futures Price Trends Forecasting Based on BPNN and Structuring Data

    Weijun LU  Chao GENG  Dunshan YU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2019/05/28
      Vol:
    E102-D No:9
      Page(s):
    1882-1886

    Forecasting commodity futures price is a challenging task. We present an algorithm to predict the trend of commodity futures price based on a type of structuring data and back propagation neural network. The random volatility of futures can be filtered out in the structuring data. Moreover, it is not restricted by the type of futures contract. Experiments show the algorithm can achieve 80% accuracy in predicting price trends.

  • Speech Quality Enhancement for In-Ear Microphone Based on Neural Network

    Hochong PARK  Yong-Shik SHIN  Seong-Hyeon SHIN  

     
    LETTER-Speech and Hearing

      Publicized:
    2019/05/15
      Vol:
    E102-D No:8
      Page(s):
    1594-1597

    Speech captured by an in-ear microphone placed inside an occluded ear has a high signal-to-noise ratio; however, it has different sound characteristics compared to normal speech captured through air conduction. In this study, a method for blind speech quality enhancement is proposed that can convert speech captured by an in-ear microphone to one that resembles normal speech. The proposed method estimates an input-dependent enhancement function by using a neural network in the feature domain and enhances the captured speech via time-domain filtering. Subjective and objective evaluations confirm that the speech enhanced using our proposed method sounds more similar to normal speech than that enhanced using conventional equalizer-based methods.
